SPECIFICATION 

TO ALL WHOM IT MAY CONCERN: 

BE IT KNOWN THAT I, Atsuki Inoue, a citizen of 
Japan residing at Kawasaki, Japan have invented certain new 
and useful improvements in 

REDUCED SWING CHARGE RECYCLING CIRCUIT 
ARRANGEMENT AND ADDER INCLUDING THE SAME 

of which the following is a specification:
TITLE OF THE INVENTION

REDUCED SWING CHARGE RECYCLING CIRCUIT 
ARRANGEMENT AND ADDER INCLUDING THE SAME 

BACKGROUND OF THE INVENTION

1. Field of the Invention 

The present invention generally relates to
a low power CMOS circuit, and particularly to a low-power
SOI device.

2. Description of the Related Art

Power reduction is important in modern VLSI
design because of increasing operating frequencies and
circuit densities, and the emergence of new mobile
applications such as portable terminals and consumer
products. CMOS is one of the lowest-power logic styles
because its circuits consume power only when the logic
states change, and it is widely used in modern LSI.
However, as the technology scales and the number
of transistors increases, the dynamic power
consumption is increasing rapidly. Decreasing the
supply voltage is the easiest way to reduce power
consumption in CMOS circuits because switching power
is proportional to the square of the supply voltage.

However, reducing the supply voltage degrades
circuit speed due to the super-linear reduction of
transistor current. The voltage applied to the transistor
gate determines the transistor conductance, and a larger
conductance can charge up the output node faster. Thus,
if the supply voltage is reduced, the voltage applied
to the gate is also reduced, which significantly
degrades the circuit speed. Reducing the threshold voltage
of the transistor is an effective way to recover this
slowdown.

However, reducing the threshold
voltage increases the sub-threshold leakage
current, and the leakage current increases the stand-by
power consumption of the LSI, which is not acceptable for
consumer products such as a portable
terminal powered from a battery. Moreover, since the
threshold voltage of recent transistors is already
selected to be rather low, a further reduction of the
threshold voltage may be difficult.

Another technique for lowering power
consumption without reducing the supply voltage is to
lower the swing voltage. In conventional low swing
voltage circuits, dynamic low swing drivers are used,
and during the evaluation of the logic at least one
of the nets or signals becomes floating. In the design
of a data path, this net usually becomes long and
many other nets cross over it. Coupling noise
from these other nets easily causes failure or slowdown of
the circuits.

SUMMARY OF THE INVENTION 

Accordingly, it is an object of the present 
invention to provide a new low power CMOS circuit, in 
which the supply voltage and/or the threshold voltage 
is not reduced, and power reduction can be achieved 
without the cost of degrading the circuit speed. 

It is another and more specific object of the 
present invention to provide an SOI device utilizing 
the low-power CMOS circuit. 

In order to achieve the above objects
according to the present invention, a circuit is proposed
in which the supply voltage is not reduced, but only the
swing voltage is reduced. As the power consumption is
proportional to the swing voltage, a dynamic power
reduction is expected from reducing the swing voltage.
Since the supply voltage is not reduced, the voltage
applied to the gate is not reduced and the circuit speed
is maintained. A charge recycling scheme is also
applied to the circuit. In typical CMOS circuits, all
the charge stored at the output node is dumped to ground
when the logic state changes. A charge recycling
circuit can re-use the charge stored in the previous
cycle and can reduce the power consumption by half.

The circuit according to the present 
invention has a configuration using both reduced swing 
voltage and charge recycling techniques. This scheme 
is called Low Swing Charge Recycling (LSCR) style. 

In order to achieve the above object, a circuit
arrangement is disclosed, which includes:

a complementary pass transistor logic;

a static driver connected to the
complementary pass transistor logic and driving
mutually complementary input nodes of the
complementary pass transistor logic with a low swing
voltage; and

a charge recycling circuit connected to the
complementary pass transistor logic and performing
charge sharing between the complementary input nodes
when the complementary pass transistor logic is not
driven by the static driver.

According to the present invention, since the
static driver can reduce the swing voltage while
maintaining the supply voltage and the charge recycling
circuit can reduce the charge provided from the supply
by half, the circuit can lower the power consumption
without suffering from circuit speed degradation.
Furthermore, since the driver of the present invention
is static, all nets or signals stay static during
evaluation of the logic. In other words, all the nets
have a path to the supply or the ground during evaluation
and they are robust against coupling from other nets.

With the above-described circuit arrangement,
a swing level of the low swing voltage ranges from a
ground voltage level to a supply voltage level minus
a threshold voltage level.

Moreover, with the above-described circuit 
arrangement, the static driver is formed of a plurality 
of transistors connected in series. 



In order to achieve the object, a low swing 
charge recycling circuit arrangement is disclosed, 
which includes: 

a complementary pass gate stage having
driving inputs to receive respective driving input signals,
having complementary outputs to produce an output signal
on one hand and a complementary output signal on the
other, and determining a logic operation of the circuit
arrangement;

a static low swing driver stage having a signal
input to receive an input signal, having a clock input
to receive a clock signal, and having complementary
outputs to produce low swing complementary signals to
be provided to the driving inputs of
the complementary pass gate stage when the clock signal is
in one of two states; and

an equalization stage connected to the
complementary outputs, having a clock input to receive
the clock signal, and producing complementary signals
to the driving inputs of the complementary pass gate
stage when the clock signal is in the other state, whereby
a charge shared signal of an intermediate voltage level
between those of the complementary outputs is shared
between the driving inputs.

With the above-described circuit arrangement,
the static low swing driver stage is operated to reduce
the swing voltage applied to a source of the complementary
pass gate stage without changing the supply voltage,
so that power consumption can be saved. Since the
level of the driving input signals to the pass gate
stage for logic operation does not have to be lowered,
the circuit speed can be maintained without degrading
the driving performance of the transistors. When the
static low swing driver stage is not operated and the
complementary outputs of the driver are shut off, the
equalization stage is operated instead and allows
charge sharing between the driving inputs of the
pass gate stage, resulting in a reduction of power
consumption. The equalization stage
performs the charge sharing between the complementary
driving inputs by connecting the driving inputs, and
pre-charges both driving inputs to a certain
intermediate voltage between the driving voltage level
and the ground level. This allows the effective logic
signal swing to be approximately half of that of a
circuit without charge recycling and results in low
power consumption.

With the above-described circuit arrangement,
the driver stage is designed to be static. As a result,
the driver stage is operated such that all the nodes
are driven by the supply potential or ground potential
during the evaluation, and thus the driver stage has
no floating nodes. Therefore, the circuit arrangement
is unlikely to cause malfunction or signal delay.

It may be particularly advantageous that the
above-mentioned low swing charge recycling circuit
arrangement is applied to an SOI device. An SOI
transistor is fabricated on an insulator and has less
parasitic capacitance. This is a good feature for
achieving low power, because excess parasitic
capacitance does not need to be charged and discharged. The bodies
of these devices are isolated from each other and cannot
be connected to a common node without an area penalty.
The devices are therefore usually used with floating bodies. A
floating body device has less body effect than a body
contacted device or an ordinary bulk device and shows higher
switching speed, because the body voltage follows the
gate voltage while the transistor turns on. Pass
transistor gates and stacked transistors benefit greatly
from this feature.

However, the floating body voltage fluctuates
during the circuit operation and causes a "history" effect.
The time constant of this phenomenon is considerably larger
than the usual cycle time of the circuit and the body voltage
changes every cycle, which causes delay fluctuation
in the circuit.

In contrast, with the circuit arrangement
according to the present invention, both the source
and drain nodes of the pass transistors are always
precharged, or equalized, to certain voltages, and the body
voltage can also be set to a similar voltage before
operation. This feature can suppress the body effect.
The floating body still gives a speed benefit due to less
body effect. Thus, the inventive circuit arrangement
is more suitable for SOI devices.

In order to achieve another object of the
present invention, an SOI adder formed by a low swing
charge recycling circuit arrangement is disclosed. The
advantages of the adder according to the present
invention over the prior art adder with respect to
circuit speed and power consumption are illustrated
by a simulation described below.

The adder according to the present invention
includes:

a carry propagating circuit for alternatively
propagating low swing driven complementary carry input
signals and charge sharing complementary carry input
signals;

a static low swing driver circuit receiving
generate signals and producing low swing driven
complementary generate signals;

a pass gate network receiving the
complementary carry input signals, the complementary
generate signals and propagate signals and being
controlled by the propagate signals for producing a
sum signal by applying an XOR operation to the complementary
carry signals with the propagate signals;

an equalization circuit adapted to be 
operative alternatively with the static low swing driver 
circuit and providing charge sharing complementary 
generate signals to the pass gate network; and 



a latch circuit connected to the pass gate 
network and latching the produced sum signal. 

This adder is provided with the same features 
as those of the above-described low swing charge 
recycling circuit arrangement. 

Furthermore, it is advantageous to connect 
this adder in series in order to achieve an adder module 
with any number of bits. 

An adder module according to the present 
invention includes: 

at least one adder connected in series, each 
adder being provided on the basis of one bit to be added; 
and 

a carry input signal equalization circuit 
receiving carry input signals and providing charge 
sharing complementary carry input signals to one end 
of the adders connected in series, 

wherein the adder includes: 

a carry propagating circuit for alternatively 
propagating low swing driven complementary carry input 
signals and the charge sharing complementary carry input 
signals; 

a static low swing driver circuit receiving 
generate signals and producing low swing driven 
complementary generate signals; 

a pass gate network receiving the 
complementary carry input signals, the complementary 
generate signals and propagate signals and being 
controlled by the propagate signals for producing a 
sum signal by applying an XOR operation to the complementary
carry signals with the propagate signals; 

an equalization circuit adapted to be 
operative alternatively with the static low swing driver 
circuit and providing charge sharing complementary 
generate signals to the pass gate network; and 

a latch circuit connected to the pass gate 
network and latching the produced sum signal. 



The above-described adder module further
includes:

a carry propagating path for propagating the
complementary carry input signals through the series of bits;

a carry skip path bypassing the adders
connected in series in order to pass the complementary
carry input signals transparently; and

a carry conflict-free circuit for preventing
a conflict between the propagated carry input signals and
the passed carry input signals.

Other objects and further features of the 
present invention will be apparent from the following 
detailed description when read in conjunction with the 
accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 illustrates a basic concept of the low swing
voltage drive technique;

Fig. 2 illustrates a basic concept of the charge
recycling technique;

Fig. 3 shows a structure of a conventional XOR
gate;

Fig. 4 shows a structure of a low swing charge
recycling XOR gate according to an embodiment of the
present invention;

Fig. 5 shows a timing diagram of a conventional
LSDD structure;

Fig. 6 shows a timing diagram of a low swing
charge recycling circuit according to an embodiment
of the present invention;

Fig. 7 shows a structure of a low swing driver
according to a first embodiment of the present invention;

Fig. 8 shows a structure of the low swing driver
according to a second embodiment of the present
invention;

Fig. 9 shows a structure of the low swing driver
according to a third embodiment of the present invention;

Fig. 10 shows a structure of a 1-bit adder
according to an embodiment of the present invention;

Fig. 11 shows a structure of a 4-bit carry skip
adder module according to an embodiment of the present
invention;

Fig. 12 shows a structure of a multi-bit adder
according to an embodiment of the present invention;

Fig. 13 shows a chip structure of a 64-bit adder
according to an embodiment of the present invention;

Fig. 14 shows a graph illustrating the add delay
of the contacted body and floating body 64-bit adders;

Fig. 15 shows a graph illustrating the add
delay of the various adders as a function of the supply
voltage;

Fig. 16 shows a graph illustrating the power
consumption of the various adders as a function of
the supply voltage; and

Fig. 17 shows a graph comparing the power
consumption of the LSDD and LSCR techniques.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the following, principles and embodiments
of the present invention will be described with reference
to the accompanying drawings.

[OVERVIEW]

Decreasing the supply voltage, Vdd, is the
easiest way to reduce power consumption in CMOS circuits.
However, reducing Vdd degrades circuit speed. Lowering
the transistor threshold voltage helps to recover the speed
degradation; however, sub-threshold leakage current
increases exponentially as the threshold voltage is lowered. In an
embodiment of the present invention, we propose a new
circuit technique using differential pass transistor
logic, a low voltage swing and charge recycling to save
power. In this configuration, we only reduce the swing
voltage and do not reduce the supply voltage, thereby
maintaining the transistor device current and avoiding
speed degradation. We evaluated a 32-bit adder using
this configuration and confirmed a 53% better power-delay
product compared with an ordinary CLA adder.
SOI devices can avoid the body effect in long pass
transistor networks and improve the speed at lower supply
voltage.

[LOW SWING VOLTAGE DRIVE AND CHARGE RECYCLING METHODS] 

Dynamic power consumption in CMOS is
proportional to the supply voltage Vdd and the swing
voltage Vsw. A typical CMOS circuit is configured such
that the swing voltage is equal to the supply voltage.
In this case, the dynamic power consumption is
proportional to the square of the supply voltage. When
the swing voltage is lower than the supply voltage,
the power consumption is reduced accordingly.
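
The proportionality stated above may be illustrated by the following minimal sketch. It is not part of the disclosed circuit. It assumes the usual first-order model in which a load capacitance charged through a swing Vsw draws a charge C*Vsw from a supply at Vdd; the function name and all numerical values are hypothetical.

```python
def switching_energy(c_load, v_dd, v_swing):
    """Energy (joules) drawn from the supply to raise c_load by v_swing.

    Assumed first-order model: the charge C * Vsw is delivered by a
    supply held at Vdd, so the energy is C * Vsw * Vdd.
    """
    if not 0.0 <= v_swing <= v_dd:
        raise ValueError("swing must lie between 0 and the supply voltage")
    return c_load * v_swing * v_dd


# Hypothetical values chosen only for illustration.
C_LOAD = 100e-15  # 100 fF load capacitance
V_DD = 1.2        # supply voltage in volts
V_TH = 0.3        # transistor threshold voltage in volts

full = switching_energy(C_LOAD, V_DD, V_DD)            # swing equals Vdd
reduced = switching_energy(C_LOAD, V_DD, V_DD - V_TH)  # swing is Vdd-Vth
saving = 1.0 - reduced / full                          # equals Vth / Vdd

print(f"full swing: {full:.3e} J, reduced swing: {reduced:.3e} J")
print(f"fractional saving: {saving:.2f}")
```

Under this model the fractional saving equals Vth/Vdd, so the benefit of the reduced swing grows as the supply voltage approaches the threshold voltage.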

We will explain the basic concept of the low
swing voltage drive, referring to Fig. 1. As shown in
Fig. 1, the swing voltage is reduced to Vdd-Vth, where Vth
represents a threshold voltage of the transistor.
Since the supply voltage still drives each of the transistors,
the drive current is not reduced, resulting
in low power consumption without circuit speed
degradation.

Another way to reduce power consumption is
to use a charge recycling method. The charge recycling
method intends to reduce power consumption by reusing
charge that has already been used once. This makes use of
charge that would otherwise have been discarded.
Fig. 2 shows the basic concept
of the charge recycling technique in combination with
a complementary logic, which is employed in embodiments
of the present invention. The left half of Fig. 2 shows
the conventional logic without charge recycling, where
in an initial state two mutually complementary nodes
each store a charge Q. When the
logic is activated, one of the nodes is grounded
so that the charge stored in that node is discharged.
As a result, a voltage difference is present between
the two nodes and a logic state can be detected. In
the final state, the node that lost its charge is
provided with the charge Q by the supply so that the
node recovers its initial state.

In contrast, if charge recycling is
used in the logic, both nodes store the same
charge Q/2 in the initial state. When the logic is
activated, one of the nodes is provided with a
further charge Q/2 from the supply, and the other one
is grounded so as to pass its charge to the ground.
As a result, in the drive state, the two nodes store
the same charges as in the conventional method.
In the final state, the two nodes are short-circuited
in order to return both nodes to
the initial state. In practice, the switch consumes some
power while it is conductive after being turned
ON. However, as can be seen from Fig. 2, the charge
recycling logic reduces the power consumption by half
as compared to that of the conventional logic, because
the charge to be provided to the charge recycling logic
from the supply amounts to just half of that for the
conventional logic.
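
The charge bookkeeping described for Fig. 2 may be sketched as follows. This is an idealized illustration only: it assumes two nodes of equal capacitance and a lossless equalizing switch, and the function names are hypothetical.

```python
import math


def conventional_supply_charge(q):
    """Charge drawn from the supply per cycle without charge recycling."""
    node_a, node_b = q, q      # initial state: both nodes hold Q
    node_b = 0.0               # activation: one node is grounded
    drawn = q - node_b         # final state: the supply restores Q
    node_b += drawn
    assert math.isclose(node_a, q) and math.isclose(node_b, q)
    return drawn


def recycling_supply_charge(q):
    """Charge drawn from the supply per cycle with charge recycling."""
    node_a, node_b = q / 2, q / 2            # initial state: both hold Q/2
    drawn = q - node_a                       # supply adds Q/2 to one node
    node_a += drawn
    node_b = 0.0                             # the other node is grounded
    node_a = node_b = (node_a + node_b) / 2  # nodes shorted, equal capacitance
    assert math.isclose(node_a, q / 2)       # initial state is recovered
    return drawn


Q = 1.0  # arbitrary unit of charge
ratio = recycling_supply_charge(Q) / conventional_supply_charge(Q)
print(f"supply charge ratio (recycling / conventional): {ratio}")
```

In this idealized model the ratio is one half, in agreement with the statement above; the power dissipated in the switch itself is not modeled.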

[CONVENTIONAL LOW SWING CIRCUIT AND INVENTIVE LOW SWING 
CHARGE RECYCLING LOGIC] 

Fig. 3 shows a structure of an XOR gate
implemented in the conventional low swing dynamic
differential circuit and Fig. 4 shows a corresponding
structure of a low swing charge recycling (LSCR) XOR
gate according to an embodiment of the present invention.
Moreover, Fig. 5 shows a timing diagram of the
conventional structure and Fig. 6 shows a timing diagram
of the low swing charge recycling (LSCR) structure
according to the embodiment of the present invention.

In the conventional structure, the XOR gate
includes a low swing voltage driver 100 and a
complementary nMOS pass gate circuit 200 determining
logical operations. When CLK=1, the low swing voltage
driver 100 is activated and it drives the source of
the pass gate circuit 200 with the low swing voltage,
Vdd-Vth. In response to the input signals to the pass
gate 200, complementary signals OUT and /OUT are pulled
to Vdd-Vth and ground level, or vice versa, as can be
seen from Fig. 5.

The OUT and /OUT signals can be connected to
succeeding stages of pass gates. Since the driver 100
is dynamically activated, one node is pulled to Vdd-Vth
and the other node, complementary to the first, is
left floating at the ground level. When CLK=0, both
outputs are discharged to ground. In practical
datapath circuits, the pass gate chain can be very long,
many wires need to be routed through the datapath, and
many signals couple with the long floating node, making it
difficult to protect the dynamic node from capacitive coupling
and/or signaling delay. Capacitive coupling
into the dynamic node can degrade speed and/or cause
operational failure. Since the conventional low swing
structure uses a dynamic low swing driver,
it is called the Low Swing Dynamic
Driver (LSDD) structure.

In contrast, according to the embodiment
of the present invention, the LSCR logic includes a
low swing voltage driver 1, a complementary nMOS pass
gate circuit 2 for determining the logical operations,
and an equalization transistor (equalizer) 3 for
performing charge recycling. When CLK=1, the low swing
voltage driver 1 is activated and the complementary
signals OUT and /OUT are pulled to Vdd-Vth and ground
level, or vice versa, in the same way as in the LSDD
structure.

Note that, in the LSCR structure, evaluation
is static and all nodes are actively driven. Since the
driver 1 is static, all the nodes are driven by the
supply or the ground during the evaluation, and thus there
exist no floating nodes. Therefore, in the LSCR
structure, operational failure and/or signaling
delay can largely be avoided.

When CLK=0, the tri-state gates driving into
the nMOS pass gates are shut off and the equalization
transistor (FET) 3 is activated, resulting in charge
sharing between the OUT and /OUT nodes. As is shown in Fig. 6,
after charge sharing occurs, both nodes are precharged
to approximately the intermediate voltage,
(Vdd-Vth)/2, between Vdd-Vth and ground. Furthermore, it
can be seen that the effective logic signal swing is
reduced to half of that for the conventional LSDD
structure, and a reduction in power consumption can be
achieved. In both the LSDD and LSCR structures, the output
from the final stage of the XOR gate is latched in a
differential sense amplifier latch circuit or
flip-flop.

[OPERATION OF LOW SWING CHARGE RECYCLING CMOS CIRCUIT 
ARRANGEMENT] 

As shown in Fig. 4, the low swing driver 1,
which limits the swing voltage, is activated by the
clocking signal CLK. The signals driven by the driver
are connected to the pass gate network 2. The outputs
of the driver 1 are connected to the sources or drains
of transistors, and the gates of these transistors are
driven by the normal swing voltage. There are two branches
of the pass gate network 2. One is driven by true signals
and the other is driven by complement signals. Each node
in one branch always has a corresponding node in the other. One
node, such as node 4, represents the true signal, and the
other node, such as node 5, represents the complement
signal.

These two nodes 4 and 5 are connected by a
transistor switch (equalizing transistor 3). The node
4 is connected to the source of the transistor 3 and
the node 5 to the drain of the transistor 3. The gate
of the transistor 3 is driven by a clock signal
(equalizing signal). This clock signal is the
complement of the clocking signal in the driver gate.
In other words, when the driver 1 drives the signals
into the pass gate network 2, the equalizing transistor 3
is off; when the driver's clock turns
off, the equalizing signal turns on the equalizing
transistor 3 and the charge stored in the nodes 4 and
5 is shared. The voltages of the nodes 4 and 5 become
the same.

As can be seen from Fig. 6, the circuit
operation consists of two phases, an equalization phase
and an evaluation phase. During the equalization phase,
the voltages of the two nodes 4 and 5 are equalized by turning
on the equalizing transistor 3, and the low swing driver
1 is disconnected from these nodes by turning off the
driver's clocking. When the circuit is evaluated, the
equalization transistor 3 is turned off and the low
swing driver 1 drives the pass gate network 2. One of
the two nodes 4 and 5 is pulled up to the higher voltage
(Vdd-Vth) and the other is pulled down to ground. The
difference between the voltages of these two nodes is detected
at the following stage, such as a sense amplifier
circuit.
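
The two-phase operation described above may be sketched behaviorally as follows. The sketch is illustrative only and is not a circuit simulation: it assumes equal capacitances on the nodes 4 and 5, ideal switches, and an ideal sense amplifier, and the function names and voltage values are hypothetical.

```python
def lscr_waveform(bits, v_dd, v_th):
    """Return [(phase, v_node4, v_node5), ...] for a sequence of data bits."""
    v_high = v_dd - v_th
    # Assumed starting point: state left by a previous evaluation.
    v4, v5 = v_high, 0.0
    trace = []
    for bit in bits:
        # Equalization phase: driver off, equalizing transistor on.
        # With equal capacitances both nodes settle at the average.
        v4 = v5 = (v4 + v5) / 2
        trace.append(("equalize", v4, v5))
        # Evaluation phase: equalizer off, static driver on.
        # One node goes to Vdd-Vth and the other to ground.
        v4, v5 = (v_high, 0.0) if bit else (0.0, v_high)
        trace.append(("evaluate", v4, v5))
    return trace


def sense(v4, v5):
    """Ideal differential sense: 1 if node 4 is above node 5, else 0."""
    return 1 if v4 > v5 else 0


data = [1, 0, 1]
trace = lscr_waveform(data, v_dd=1.0, v_th=0.25)
decoded = [sense(v4, v5) for phase, v4, v5 in trace if phase == "evaluate"]
for step in trace:
    print(step)
```

In this model every equalization phase leaves both nodes at (Vdd-Vth)/2, so each evaluation moves a node by only half of the Vdd-Vth swing, which is the halving of the effective logic swing noted for Fig. 6.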

[STRUCTURE OF LOW SWING DRIVER]

The low swing drivers according to the present
invention can be roughly categorized into three classes,
as shown in Figs. 7, 8, and 9. Fig. 7 shows a structure
of a low swing driver according to a first embodiment
of the present invention, in which the swing level
ranges from ground level to Vdd-Vth. Fig. 8 shows a
structure of the low swing driver according to a second
embodiment of the present invention, in which the swing
level ranges from Vth to Vdd. Fig. 9 shows a structure
of the low swing driver according to a third embodiment
of the present invention, in which the swing level
ranges from Vth to Vdd-Vth. In each of the driver
configurations shown in Figs. 7 to 9, there are type
1 and type 2 drivers. The type 1 driver comprises a
plurality of transistors connected in series. The type
2 driver includes transistor inverters clipped
by pass gates.

Since the type 1 low swing driver of the first
structure, as shown in Fig. 7, may be considered the
typical one, it is applied to the various embodiments
of the present invention.

[1-BIT ADDER]

Referring to Fig. 10, a 1-bit adder according
to another embodiment of the present invention will
be explained, in which the inventive low swing charge
recycling technique is applied to the 1-bit adder.
Fig. 10 shows a structure of the 1-bit adder 10 according
to the embodiment. The 1-bit adder 10 comprises a logic
pass gate circuit. The 1-bit adder 10 receives complementary
carry input signals that are driven by the low
swing driver 1 (not shown in Fig. 10), as well as complementary
generate signals Gi and /Gi. When the low swing driver 1 is
deactivated, the carry input signals correspond to
complementary signals that are charge-shared by the
equalizer 3 (not shown in Fig. 10). In the 1-bit adder
10, the generate signals Gi and /Gi are converted to low
swing charge sharing internal generate signals by a
low swing driver 13 and an equalization circuit 14.

A gate of an nMOS pass gate 11 in the 1-bit
adder 10 is controlled by propagate signals Pi and /Pi.
Carry output signals Ci and /Ci may be transferred to
a succeeding stage as the carry input signals if the
1-bit adders are connected in series. Sum signals Si
and /Si are formed by applying an XOR operation to the true
carry input signal with the true propagate signal and
to the complement carry input signal with the complement
propagate signal.

The generate signal Gi and the propagate signal
Pi can be written as:

Gi = Ai AND Bi
Pi = Ai XOR Bi

where both Ai and Bi represent input signals to the 1-bit
adder 10. It is noted that the logic swing of the signals
Ai, Bi, Gi and Pi is Vdd.
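
The logic of the 1-bit adder may be sketched at the Boolean level as follows. This is an illustration only: it ignores the complementary rails and the voltage swings, and the carry selection shown (the carry input is passed when the propagate signal is 1, and the generate signal is passed otherwise) is an assumed, conventional pass-gate formulation rather than a statement of the exact transistor arrangement of Fig. 10. The function name is hypothetical.

```python
def one_bit_adder(a, b, c_in):
    """Boolean model of a generate/propagate 1-bit adder.

    Returns (sum, carry_out, generate, propagate), each 0 or 1.
    """
    g = a & b          # generate:  Gi = Ai AND Bi
    p = a ^ b          # propagate: Pi = Ai XOR Bi
    s = p ^ c_in       # sum: XOR of the carry input with the propagate signal
    # Assumption: pass-gate style carry selection controlled by Pi.
    c_out = c_in if p else g
    return s, c_out, g, p


# Print the full truth table.
for a in (0, 1):
    for b in (0, 1):
        for c_in in (0, 1):
            s, c_out, g, p = one_bit_adder(a, b, c_in)
            print(a, b, c_in, "->", "sum", s, "carry", c_out)
```

When Pi is 0 the two inputs are equal, so the carry output equals Gi regardless of the carry input, which is why selecting between the carry input and Gi is sufficient.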

[4-BIT ADDER MODULE]

Referring to Fig. 11, a 4-bit carry skip adder
module according to an embodiment of the present
invention will be explained. Fig. 11 shows a structure
of the 4-bit carry skip adder module implementing LSCR
technology. The 4-bit carry skip adder of Fig. 11 can
be easily formed by connecting four 1-bit adders 10
of the previous embodiment shown in Fig. 10.

To implement the carry skip, the
4-bit carry skip adder module 20 includes the four 1-bit
adders 10, carry propagating paths 21 and 22 for
propagating the carry bits in series, and bypass paths
23 and 24. The bypass paths 23 and 24 include
bypass transistors 25 and 26, respectively, for passing
the carry signals directly in order to reduce the delay
due to the carry propagating paths 21 and 22. In this
case, a conflict between the carry signals propagating
along the carry propagating paths and the bypass carry
signals passing through the bypass paths may
increase the propagation delay. Therefore, in this
embodiment, the transistor 12 on the carry propagating
path 21 and the transistor 25 on the bypass path 23
are exclusively gated in order to avoid the conflict.

In this embodiment, the carry input signals
Cin and /Cin are applied to the 4-bit carry skip adder
module 20 by means of the low swing driver 1 (not shown
in Fig. 11). When the low swing driver 1 is activated,
the carry input signals, which are low swing voltage
controlled, are applied to the module 20. Otherwise,
an equalizer (equalizing transistor) 27 is activated
and the charge sharing signals are applied to the module
20 as the carry input signals Cin and /Cin.

For each bit of the 4-bit adder module 20,
generate signals G0 and /G0, G1 and /G1, and so on are
applied to the 4-bit adder module 20 as the low swing
charge recycling signals, for example, by means of a
low swing tri-state inverter 13 and an equalizer 14.

Although the carry signals propagating
through the 4-bit adder module 20 are low swing signals,
the equalizing transistor 27 is not necessarily provided
at every low swing node, but is provided at the output
of the low swing driver 1. This is because the
pass transistors are gated by the complementary signals, so that
short-circuiting the source of one of the transistors
is sufficient to produce the low swing at the other
transistor.

It is noted that the differential sense
amplifier latch at the final stage is not shown in Fig. 11
for clarity of the drawing.

The carry signals propagate through the
module 20 and the differential voltage between Cout and /Cout
is amplified and latched in the sense amplifier flip-flop
(not shown in Fig. 11). Note that there is no contention
between the bypass transistor 25 gated by P0P1P2P3 and
the local carry chain. The sum signals are generated
by pass gate XORs and are also latched in the sense
amplifier flip-flops (not shown in Fig. 11).
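
The carry skip behavior of the 4-bit module may be sketched at the Boolean level as follows. The sketch is illustrative only and assumes single-rail logic values rather than the complementary low swing signals. The bypass is taken when the product P0P1P2P3 is 1 and the local carry chain is taken otherwise, which models the exclusive gating described above. The function name and the helper names in the usage lines are hypothetical.

```python
def carry_skip_4bit(a_bits, b_bits, c_in):
    """Boolean model of a 4-bit carry skip module.

    a_bits, b_bits: lists of four bits, least significant bit first.
    Returns (sum_bits, carry_out).
    """
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate signals
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate signals

    sums = []
    carry = c_in
    for p_i, g_i in zip(p, g):
        sums.append(p_i ^ carry)          # pass gate XOR for the sum
        carry = carry if p_i else g_i     # local carry chain (assumed form)

    skip = all(p)                         # P0P1P2P3
    # Exclusive gating: exactly one of the two paths drives the output.
    c_out = c_in if skip else carry
    return sums, c_out


# Usage: 0b0111 + 0b1001 with carry-in 0 (bits listed LSB first).
sum_bits, c_out = carry_skip_4bit([1, 1, 1, 0], [1, 0, 0, 1], 0)
value = sum(bit << i for i, bit in enumerate(sum_bits)) + (c_out << 4)
print(sum_bits, c_out, value)
```

Because the local chain also yields the carry input when every propagate signal is 1, the bypass changes only the delay and not the result, so gating the two paths exclusively costs nothing logically.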

[MULTI-BIT ADDER]

A multi-bit adder 50 formed of any number of 
bits can be designed by serially connecting the 
above-described 4-bit carry skip adder modules 20 . 
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Fig. 12 shows a structure of the multi-bit adder according 
to an embodiment of the present invention. As shown 
in Fig. 12, the multi-bit adder 50 includes low swing 
drivers 40 for receiving full swing complementary carry 
5 input signals and generating the low swing complementary 
carry input signals followed by the 4-bit adder modules 
20 connected in series. The respective output of each 
4~bit adder module 20 is connected to the differential 
sense amplifier latch circuit (SA-FF) 30, which 

!;j 10 generates bit sum signals Si and 5,. of the full swing. 
Q 

Cf! The carry output signals C QUt and are also delivered 

M 



SJ 
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by means of the differential sense amplifier latch 
circuit 30. 

It is noted that the internal carry signals
propagating through each 4-bit adder module 20 are low
swing signals.
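
The serial connection of the 4-bit carry skip modules can be
illustrated by a purely behavioural sketch. It models logic
values only, not the low swing signalling, the drivers 40 or
the SA-FF 30, and the function names carry_skip_module and
multi_bit_add are hypothetical names chosen for this
illustration.

```python
def carry_skip_module(a_bits, b_bits, carry_in):
    """Behavioural model of one 4-bit carry skip module (LSB first)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate signals P0..P3
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate signals G0..G3
    sums, carry = [], carry_in
    for p_i, g_i in zip(p, g):
        sums.append(p_i ^ carry)         # sum bit: XOR of propagate and carry
        carry = g_i | (p_i & carry)      # local carry chain
    group_propagate = p[0] & p[1] & p[2] & p[3]
    if group_propagate:
        # The bypass path forwards carry_in directly. The local chain
        # yields the same value, so the two paths never disagree.
        assert carry == carry_in
        return sums, carry_in
    return sums, carry


def multi_bit_add(a, b, n_bits=64, carry_in=0):
    """Add two n_bits-wide integers with serially connected 4-bit modules."""
    carry, result = carry_in, 0
    for base in range(0, n_bits, 4):
        a_bits = [(a >> (base + i)) & 1 for i in range(4)]
        b_bits = [(b >> (base + i)) & 1 for i in range(4)]
        sums, carry = carry_skip_module(a_bits, b_bits, carry)
        for i, s in enumerate(sums):
            result |= s << (base + i)
    return result, carry
```

For example, multi_bit_add(a, b, n_bits=32) chains eight
modules and multi_bit_add(a, b, n_bits=64) chains sixteen.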

In designing a wider adder, it is expected
that the carry bypass paths would be multiplexed
in order to achieve a high-speed circuit. Nevertheless,
an experimental 64-bit adder was designed by serially
connecting 16 4-bit adder modules 20 according to a
further embodiment of the present invention. The 64-bit
adder was fabricated in 0.08 µm SOI CMOS technology.
The 64-bit adder may be evaluated as a 32-bit adder
depending on the module structure. Fig. 13 shows a
photomicrograph of the 64-bit adder. The actual size of
the 64-bit adder with the floating body SOI devices is
23 by 840 µm.

[APPLICATION TO SOI DEVICE] 

Both the LSDD and LSCR structures can be designed
using either bulk CMOS or SOI CMOS. In either technology,
these structures make heavy use of serially connected
pass transistors for performing logical operations. In
bulk CMOS, the well potential is fixed, so the threshold
voltage becomes high due to the body effect, resulting
in increased delay.

In SOI CMOS, the use of the floating body
SOI enables the well potential to be raised due to
coupling of the well potential to the source of the
device. This reduces the body effect and suppresses
the increase in delay. Fig. 14 shows a graph
illustrating the delay time for two 64-bit adders
using the 0.08 µm SOI CMOS device, in which one adder
is designed based on the contacted body and the other
based on the floating body. This simulation shows the
benefit of the floating body in 0.08 µm SOI. The add
delay is plotted versus VDD for both 64-bit adders.

The critical path of the 64-bit adder
consists of 21 serially connected nMOS
transistors. In these simulations, the extra
capacitance associated with the body contact was not
included, in order to show only the benefit of the reduced
body effect in the floating body devices. The addition
times were estimated using an offset voltage of 100
mV, which is the signal required by the flip-flop. The
floating body improves the critical path delay by
13% at VDD = 1.3 V and by up to 24% at VDD = 0.9 V.
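
The figure of 21 serially connected transistors is stated
without derivation. One plausible accounting, offered here
only as an assumption and not taken from the text, is that
the carry ripples through all four bits of the first module,
passes one bypass transistor in each intermediate module,
and ripples through three bits of the last module to reach
the most significant sum bit. The helper name
critical_path_transistors is hypothetical.

```python
def critical_path_transistors(n_modules, bits_per_module=4):
    """Count series pass transistors under the assumed accounting."""
    if n_modules < 2:
        raise ValueError("the accounting assumes at least two modules")
    first = bits_per_module        # ripple through the whole first module
    bypassed = n_modules - 2       # one bypass transistor per middle module
    last = bits_per_module - 1     # ripple up to the carry into the MSB
    return first + bypassed + last
```

Under this assumption, 16 modules give 4 + 14 + 3 = 21,
which matches the stated count.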

In floating body SOI devices, the body
potential changes significantly, and this change appears
as the history effect, that is, a change in delay due to
previous circuit activity. In particular, if the pass
transistor logic operates as full swing static logic,
both the source and drain voltages range from
ground level to the supply voltage, so the body potential
changes by a large amount and the variation in the delay
time due to the history effect becomes wider.

According to the low swing charge recycling
structure of the present invention, the history effect
in SOI devices is minimized, because the nMOS pass
transistor network is equalized every clock cycle,
during equalization the source and drain voltages of the
pass transistors are reset to similar voltages, and the
swing of the source and drain voltages of the pass
transistors is limited to the range from ground level to
VDD - Vth. The critical path delay through the 4-bit
floating body adder module was simulated for 100 cycles
at 500 MHz, and it was confirmed that the delay
fluctuation is less than 0.2%.
[REDUCTION OF DRIVE CURRENT DUE TO BODY EFFECT] 

The body effect will be explained in detail.
Conventionally, the drive current of the transistor
is calculated assuming that the source voltage (S)
and the body voltage (B) are equivalent. That is to
say, it is assumed that in the nMOS configuration the
source voltage is at ground level, and in the pMOS
configuration the source voltage is at the supply
voltage. This assumption may be true if the device forms
a simple inverter.

However, this is not the case when the device
forms a multi-input NAND gate or NOR gate that includes
serially connected transistors, or when serially
connected transistors such as pass transistor logic
are driven at the source. For example, the source node
of the nMOS transistor arranged nearer the output
side of a 2-input NAND gate is connected to the drain
terminal of the other transistor. During switching
operation, the voltage of this source node becomes higher
than ground level. If the body of this transistor is
connected to ground level, the source voltage becomes
higher than the body voltage. Accordingly, in effect,
the body is reverse biased with respect to the source,
toward negative voltage. Furthermore, the
gate-source voltage becomes lower than the ordinarily
applied voltage. Therefore, the reverse biased body
voltage and the lower gate-source voltage cause the
drive current of the transistor to be reduced.
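
The two mechanisms described above can be illustrated with
the textbook body-effect relation
Vth = Vth0 + gamma * (sqrt(2*phiF + VSB) - sqrt(2*phiF))
and an idealized square-law saturation current. This is a
minimal sketch: the parameter values and the function names
threshold_voltage and saturation_current are assumptions
made for illustration and are not taken from the
specification. The square-law model is also only a rough
approximation for short-channel devices.

```python
import math


def threshold_voltage(v_sb, vth0=0.3, gamma=0.4, two_phi_f=0.8):
    """Textbook body-effect model; all default values are illustrative."""
    return vth0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def saturation_current(v_g, v_s, v_b=0.0, k=1.0,
                       vth0=0.3, gamma=0.4, two_phi_f=0.8):
    """Idealized square-law drive current of an nMOS transistor."""
    v_gs = v_g - v_s                                   # lowered by a raised source
    v_th = threshold_voltage(v_s - v_b, vth0, gamma, two_phi_f)
    overdrive = v_gs - v_th
    return 0.5 * k * overdrive ** 2 if overdrive > 0 else 0.0


# Source at ground versus source raised by 0.3 V, body grounded in both cases.
i_grounded_source = saturation_current(v_g=1.2, v_s=0.0)
i_raised_source = saturation_current(v_g=1.2, v_s=0.3)
# Same raised source, but with the body following the source.
i_body_follows = saturation_current(v_g=1.2, v_s=0.3, v_b=0.3)
```

In this sketch, raising the source lowers the current through
both the smaller gate-source voltage and the higher threshold,
while letting the body follow the source removes the threshold
increase only.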

[BODY EFFECT IN SOI]

In SOI devices, it is difficult to have
access to the body terminal under the conventional SOI
structure. Even if access to the body terminal can
be made, it involves an enlargement of the surface area
and an increase in parasitic capacitance. Therefore, the
commonly used SOI device is a floating body SOI device,
which leaves the body floating. In the floating body
SOI device, the body voltage is determined by
several factors such as (1) a leakage current of the PN
junction between the source and the body or the drain
and the body, (2) a minority carrier current (i.e.,
substrate current) produced near the drain end
when the current flows through the channel, and (3)
the coupling effect due to the parasitic capacitance at
the body-gate, body-source, and body-drain,
respectively.

However, the body voltage is not kept constant,
but rather changes depending on the transistor's state.
The rate of the variation in the body voltage may be
slower or faster depending on the factor that causes the
body voltage to fluctuate. If the body voltage changes due to
the leakage current, it fluctuates on the order of
microseconds, whereas if it changes due
to the capacitive coupling, it fluctuates on the order
of picoseconds. The slow
fluctuation of the body voltage on the order of
microseconds results in a variation of the switching
property (e.g., delay) of each circuit operation. Thus,
the circuit operation is dependent on the previous
circuit operations. In other words, the circuit is
influenced by the history effect.

The magnitude of the history effect depends on
the swing of the variation of the body voltage.
Consequently, since the variation of the body voltage
depends on the variation of the other terminals of the device,
the history effect becomes more significant as the
magnitude of the variation of the other terminals'
voltages becomes higher. In particular, since, in a
transistor forming a part of the pass gates, both the source
and drain voltages of the transistor change from ground
level to the supply voltage level, the history effect is
more significant than that in serially connected transistors.
[HISTORY EFFECT IN SOI TRANSISTOR IMPLEMENTED BY
INVENTIVE LOW SWING CHARGE RECYCLING STRUCTURE]

According to the various embodiments of the 
present invention, the following two features can reduce 
the history effect. 

Firstly, the nMOS transistors performing
logical operations are formed by the pass gates. Since
the source and drain voltages are driven in accordance
with the low swing controlled voltage, the magnitude
of the variation of the body voltage can be reduced.

Secondly, since the source and drain voltages
in the pass gates are reset to approximately the same
voltage by each equalization process before the
circuit operation, the variation of the voltages before
the circuit operation can be reduced.
[SIMULATION OF VARIOUS ADDERS] 
The performance of a 32-bit adder using the
low swing charge recycling structure according to the
present invention is compared with other adders and
with the low swing dynamic differential (LSDD) adder.
In this example, the 4-bit carry skip scheme is employed
to implement the 32-bit adder. Since the 4-bit adder
modules are connected in series in order to realize
the multi-bit adder, as the number of bits increases
the delay time of the multi-bit adder increases rapidly.
If the multi-bit adder is configured to have a bit length
of more than 32, a carry skip mechanism with a multiple
bypass structure has to be added to the multi-bit adder.
Therefore, in this example, the 32-bit configuration
is employed.

The following four types of 32-bit adders (1)
to (4) are compared with respect to speed (i.e., delay)
and power.

(1) Serially connected carry look ahead (CLA) adder:

This adder includes eight 4-bit CLA adders
connected in series. The circuit is
designed using conventional static CMOS. Its
architecture is equivalent to that of the embodiment
of the present invention, but it uses conventional full
swing CMOS.

(2) Multi-level CLA adder: 

This adder is the conventional CLA adder
including the 4-bit CLA adders in combination with the
8-bit CLA adders. Like the above-described serially
connected CLA adder, this multi-level CLA adder uses
full swing CMOS. This architecture is called the
multi-level CLA in order to distinguish it from the
serially connected CLA adder.

(3) LSDD adder: 

This adder is a 32-bit LSDD adder. It
includes eight 4-bit carry skip adders connected
in series. The circuit architecture of
this adder is equivalent to that of the embodiment of
the present invention, but the signal driving scheme
is different from that of the present invention.

(4) LSCR adder: 

This adder is a 32-bit low swing charge
recycling adder according to the above-described
embodiment of the present invention. The 32-bit LSCR
adder is formed by eight 4-bit adder modules connected in
series.

Fig. 15 shows a graph illustrating the add
delay of the various adders as a function of the supply
voltage, and Fig. 16 shows a graph illustrating the power
consumption of the various adders as a function of
the supply voltage. Pseudo-random inputs and a clock
frequency of 100 MHz were used to evaluate the power
consumption. As can be seen from Figs. 15 and 16, the serially
connected CLA adder was the worst adder in terms of
the delay time, and the adders other than the serially
connected CLA adder presented approximately the same
performance with respect to the delay time. As to the
power consumption, the multi-level CLA adder presented
the worst performance and the serially connected CLA
adder presented the best performance.

The simulation results for the
above-described four adders are given in Figs. 15 and 16.
The serially connected CLA adder could reduce the power
consumption, but increased the delay time.
In addition, the multi-level CLA adder could reduce
the delay time, but consumed more power. This
indicates that the conventional full swing CMOS adder
is not able to reduce both the delay time and the power
consumption at the same time. In contrast, it can
be said that the LSDD adder and the LSCR adder, which
employ low swing CMOS, achieved not
only the shorter delay time but also the lower
power consumption. In particular, the LSCR adder using
the charge recycling scheme reduced the power
consumption below that of the multi-level CLA adder
by 49%, and below that of the LSDD adder by 10%.

Fig. 17 shows a graph comparing the power
consumption of the LSDD and LSCR techniques. The
power is consumed in a clock drive portion, as well
as in a full swing signal portion and a low swing signal
portion during the generation of control signals. The
use of the charge recycling scheme achieved a power
consumption in the low swing signal portion that is
24% lower than that of the LSDD adder. This
reduction contributed to the reduction of the
entire power consumption by 10%. The reason why the
reduction of the power consumption in the LSCR adder
is less than the theoretical maximum of 50% is primarily
a parasitic component of the transistors added
to the LSDD adder in order to design the LSCR adder,
or the transient leakage current during the clock
transition.
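
The percentages above can be related by simple arithmetic.
The sketch below rests on two assumptions that are not
stated in the text: (a) ideal equalization of a
complementary pair leaves each node at half the swing, so
only half the charge has to be drawn from the supply in
the next cycle, which is one way to arrive at the
theoretical maximum of 50%, and (b) only the low swing
signal portion differs between the LSDD and LSCR adders.
The function names are hypothetical.

```python
def supply_charge_per_cycle(c_node, v_swing, recycle):
    """Charge drawn from the supply to raise one node to v_swing.

    With ideal charge recycling the node starts from v_swing / 2
    after equalization; without it the node starts from ground.
    """
    start = v_swing / 2 if recycle else 0.0
    return c_node * (v_swing - start)


def implied_low_swing_share(portion_saving, total_saving):
    """Fraction of LSDD power in the low swing portion, under (b)."""
    return total_saving / portion_saving


# Ideal saving from recycling (arbitrary capacitance and swing).
ideal_saving = 1.0 - (supply_charge_per_cycle(1.0, 0.8, True)
                      / supply_charge_per_cycle(1.0, 0.8, False))

# A 24% saving in one portion giving a 10% saving overall.
share = implied_low_swing_share(0.24, 0.10)
```

Under assumption (b), the low swing signal portion would
account for roughly 0.10 / 0.24, or about 42%, of the LSDD
adder power.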
The inventive low power consumption circuit
arrangement is disclosed, in which the circuit
arrangement uses the low swing logic scheme and the
charge recycling scheme. The disclosed arrangement can
reduce the power consumption without introducing a delay
time penalty by lowering not the supply voltage but
only the swing voltage. This inventive circuit
arrangement was applied to an adder, and the performance
of the adder was evaluated and compared with various
adders of different structures. It has been found
that the conventional full swing CMOS architecture could
not suppress both the delay and the power consumption at
the same time, and that the adder involving the low swing
logic technology achieved the reduction of both
the delay time and the power consumption together.
Furthermore, it has been found that the adder involving
the charge recycling technology reduced the
entire power consumption below that of the
conventional low swing logic technology by 10%.

Further, the present invention is not limited
to these embodiments, but variations and modifications
may be made without departing from the scope of the
present invention.

The present application is based on a U.S.
priority application, which is U.S. provisional patent
application No. 60/265,989 filed on February 2, 2001,
the entire contents of which are hereby incorporated
by reference.



